Probably Approximately Optimal Satisficing Strategies
Authors
Abstract
A satisficing search problem consists of a set of probabilistic experiments to be performed in some order, seeking a satisfying configuration of successes and failures. The expected cost of the search depends both on the success probabilities of the individual experiments and on the search strategy, which specifies the order in which the experiments are to be performed. A strategy that minimizes the expected cost is optimal. Earlier work has provided "optimizing functions" that compute optimal strategies for certain classes of search problems from the success probabilities of the individual experiments. We extend those results by providing a general model of such strategies, and an algorithm PAO that identifies an approximately optimal strategy when the probability values are not known. The algorithm first estimates the relevant probabilities from a number of trials of each undetermined experiment, and then uses these estimates, together with the proper optimizing function, to identify a strategy whose cost is, with high probability, close to optimal. We also show that if the search problem can be formulated as an and-or tree, then the PAO algorithm can also "learn while doing", i.e., gather the necessary statistics while performing the search.
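The estimate-then-optimize idea in the abstract can be illustrated with a minimal sketch. It is not the paper's PAO algorithm: it assumes the simplest problem class, in which the experiments are independent and the search stops at the first success. For that case, ordering experiments by decreasing probability-to-cost ratio is a classical optimal rule, and it stands in here for the "optimizing function". The sample size comes from a standard Hoeffding bound with a union bound over experiments, not from the paper's analysis. All function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def expected_cost(order: Sequence[int], probs: Sequence[float],
                  costs: Sequence[float]) -> float:
    """Expected cost of running experiments in `order`, stopping at the first success."""
    total = 0.0
    reach = 1.0  # probability that every earlier experiment failed
    for i in order:
        total += reach * costs[i]
        reach *= 1.0 - probs[i]
    return total


def optimizing_function(probs: Sequence[float],
                        costs: Sequence[float]) -> List[int]:
    """Assumed stand-in optimizing function: sort by decreasing p_i / c_i."""
    return sorted(range(len(probs)),
                  key=lambda i: probs[i] / costs[i], reverse=True)


def hoeffding_sample_size(n: int, eps: float, delta: float) -> int:
    """Trials per experiment so that all n estimates are within eps of the
    true probabilities with probability at least 1 - delta."""
    return math.ceil(math.log(2 * n / delta) / (2 * eps ** 2))


def estimate_probs(experiments: Sequence[Callable[[], bool]],
                   trials: int) -> List[float]:
    """Empirical success frequency of each experiment over `trials` runs."""
    return [sum(run() for _ in range(trials)) / trials for run in experiments]


def pao_sketch(experiments: Sequence[Callable[[], bool]],
               costs: Sequence[float], eps: float, delta: float) -> List[int]:
    """Estimate the unknown probabilities, then apply the optimizing function."""
    trials = hoeffding_sample_size(len(experiments), eps, delta)
    p_hat = estimate_probs(experiments, trials)
    return optimizing_function(p_hat, costs)
```

Calling `pao_sketch` with a list of zero-argument callables that each return `True` or `False`, plus a matching list of costs, returns an ordering of experiment indices. Comparing `expected_cost` for that ordering with the cost of the ordering computed from the true probabilities shows how far the learned strategy is from optimal.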
Similar resources
Robust versus optimal strategies for determining the speed-accuracy tradeoff on two-alternative forced choice tasks
It has been proposed that animals and humans might choose a speed-accuracy tradeoff that maximizes reward rate. For this utility function the simple drift-diffusion model of two-alternative forced-choice tasks predicts a parameter-free optimal performance curve that relates normalized decision times to error rates. However, behavioral data indicate that only ≈ 30% of subjects achieve optimality,...
Robust versus optimal strategies for two-alternative forced choice tasks.
It has been proposed that animals and humans might choose a speed-accuracy tradeoff that maximizes reward rate. For this utility function the simple drift-diffusion model of two-alternative forced-choice tasks predicts a parameter-free optimal performance curve that relates normalized decision times to error rates under varying task conditions. However, behavioral data indicate that only ≈ 30% ...
Satisficing in Time-Sensitive Bandit Learning
Much of the recent literature on bandit learning focuses on algorithms that aim to converge on an optimal action. One shortcoming is that this orientation does not account for time sensitivity, which can play a crucial role when learning an optimal action requires much more information than near-optimal ones. Indeed, popular approaches such as upper-confidence-bound methods and Thompson samplin...
Optimal Satisficing
Herbert Simon introduced the notion of satisficing to explain how boundedly rational agents might approach difficult sequential decision problems. His satisficing decision makers were offered as an alternative to optimizers, who have impressive computational capacities that allow them to maximize. There is no reason, however, why satisficers cannot do their task optimally. In this paper, we p...
Scenario-Based Satisficing in Saving: A Theoretical and Experimental Analysis
Contrary to the models of deterministic life cycle saving, we take it for granted that uncertainty of one’s future is the essential problem of saving decisions. However, unlike the stochastic life cycle models, we capture this crucial uncertainty by a non-Bayesian scenario-based satisficing approach. Decision makers first form aspirations for a few relevant scenarios, and then search for saving...
Journal title:
- Artif. Intell.
Volume 82, Issue
Pages -
Publication date 1996